\section{Methods}
Before playing the game, the player must pass a screening test to ensure that he is in the target group (see Section~\ref{targetGroup}). During the experiment, several methods are used for triangulation purposes \citep{pvLogic} to test the player's understanding and enjoyment. Since it might be difficult for the player to recall different parts of the game after a playthrough, the player is stopped six times at predetermined checkpoints. At each checkpoint, the player talks about the puzzle he just finished. This is conducted as a semi-structured interview \citep{InteractDesign}.

A downside of this method is that players might fall out of flow \citep{flow} due to the interruptions. Another downside is that, since the game itself can take up to 20 minutes to complete, interrupting the player makes an already long experiment even longer. An alternative would be to video record the session and have the player elaborate on it afterwards. However, this alternative was rejected, as it would make the experiment take considerably longer.

After playing through the game, the player is asked to fill out a questionnaire (see Appendix B). The player uses a mouse and keyboard as well as a headset when playing the game. Mouse and keyboard are used instead of a controller, since we expect most experienced gamers to be more familiar with them. The headset is used to isolate the player from the surroundings and help him become immersed in the game.

\subsection{Screening}
Before the screening, the test participant is asked how familiar he is with first-person video games such as \textit{Half-Life 2} and \textit{Minecraft}. Only if the person seems familiar with these types of games does he begin the screening. This is done to see if the person fits the target group. In the screening, the person is asked to solve simple tasks using the same controls as in the game, but without the element of light and shadows (see Figure~\ref{fig:ScreeningLevel}). His performance is evaluated by the facilitator, who fills out a form during the experiment (see Appendix A). If the participant passes the test, he is asked to play the actual game. Otherwise, he is told that he does not fit the target group and therefore cannot participate in the test.

\begin{figure}[htbp]
\centering
\includegraphics[width=0.50\textwidth]{Pictures/Design/ScreeningLevel.png}
\caption{The screening level, in which the player can only walk, look around, jump, and pick up the box.}
\label{fig:ScreeningLevel}
\end{figure}

\subsection{Understanding}
To test understanding, it is important to observe the player's process when solving puzzles in the game. The overall goal is to test the player's understanding and how well his mental model matches the designer's conceptual model. Here, it is important that the player's mental model contains a practical understanding of how lights and shadows work in the game universe. 

A common technique that enables explicit observation of the player's way of thinking during problem solving is to have the player think aloud \citep{InteractDesign}. This technique was considered, but it might influence the player's experience by making him focus too much on describing his actions. It was therefore decided to have the player explain his way of thinking after solving a puzzle instead. This also allows the facilitator to ask guiding questions such as ``How did you solve the puzzle?'' and ``How did you come up with that solution?'' without interrupting the process of solving said puzzle. The questions are asked and answered verbally, rather than in writing, to reduce experiment time.

By listening to the player's explanation of how he solved a puzzle, the facilitator estimates how well the player understands the rules of the game. The data gathered from these questions are subjective and qualitative. However, the facilitator quantifies them by assigning a score from 1 to 10.

The first questions the player is given in the questionnaire after the playthrough are directly aimed at testing the player's understanding of the game universe. This set of questions will be referred to as \textit{the comprehension test}. 

The questions are multiple-choice and describe scenarios that could occur in the game universe. For each question, there is one correct answer, some less correct answers, and some wrong answers. A correct answer gives 4 points, a less correct answer gives 1 or 2 points, and a wrong answer gives either 0 or $-1$ points. This way, the score reflects how well the player understands the rules, rather than only whether each answer is right or wrong. This approach was inspired by the exam that players of Wizards of the Coast's \textit{Magic: The Gathering} need to take to become judges at official tournaments \citep{Magic}.
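
As a sketch of how such a test score could be computed, assume that the points are simply summed. The symbols $N$, $p_i$, and $S$ are introduced here for illustration only, and the number of questions in the example is hypothetical. Let $p_i \in \{-1, 0, 1, 2, 4\}$ denote the points awarded for the player's answer to question $i$ out of $N$ questions. The comprehension test score is then
\[
S = \sum_{i=1}^{N} p_i, \qquad -N \le S \le 4N .
\]
For example, with $N = 5$, a player who gives three correct answers, one less correct answer worth 2 points, and one wrong answer worth $-1$ points would obtain $S = 3 \cdot 4 + 2 - 1 = 13$ out of a possible 20.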

After answering the test questions, the player is asked to describe the core concept of the game as briefly as possible. In the questionnaire, the words ``core concept'' are used, since ``rules'', the term used in this report, might be misinterpreted. This question tests whether the player's description of the rules is correct. Even if the player saw the rules in the beginning, it is not certain that he understands the rules and can describe them. It is possible that the player just recalls the sentence from the beginning; however, this will be apparent when comparing the description with the player's test score.

Based on these three methods of testing understanding, a general score for understanding is assigned to each player.

\subsection{Enjoyment}
When the players are interrupted during the game, they are also asked to rate how enjoyable their experience was on a Likert scale (see Figure~\ref{fig:ExperienceScale}) and to elaborate on their answer. The players rate their own enjoyment, as this is difficult for the facilitator to determine without using physiological measurements.

\begin{figure}[htbp]
\centering
\includegraphics[width=1\textwidth]{Pictures/Design/ExperienceReference}
\caption{The test participants were asked to rate their experience using this scale.}
\label{fig:ExperienceScale}
\end{figure}

Furthermore, in the questionnaire after the game, the players are asked to rate how enjoyable the overall experience was, even though they have already rated each individual puzzle. This is done for triangulation purposes.

By averaging the different enjoyment scores, a final score for enjoyment is calculated for each player.
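
One way to read this averaging, stated here as an assumption rather than as the exact procedure used, is an unweighted mean over the six checkpoint ratings $e_1, \dots, e_6$ and the overall rating $e_{\mathrm{o}}$ from the questionnaire, provided that all ratings are given on the same scale. The symbols are introduced here for illustration only:
\[
E = \frac{1}{7}\left( \sum_{k=1}^{6} e_k + e_{\mathrm{o}} \right).
\]
Under this reading, each checkpoint rating carries the same weight as the overall rating.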

%\subsection{Possible factors}
%Besides measuring understanding and enjoyment, the experiment gathers possible factors through the questionnaire and data logged by the game.
%
%After the playthrough, the test participants are asked the following question: "How much more enjoyable/entertaining do you think the game would have been with/without a written explanation of the core concept?". This is measured on a scale from 1 to 10, where 1 is "Much less enjoyable/entertaining" and 10 is "Much more enjoyable/entertaining". This can be used for investigating what participants think would be most enjoyable instead of directly testing it
%
%The player is interrupted and asked to assess his enjoyment and the difficulty of the puzzles on a Likert scale, which then can be used to measure how much the difficulty influences the enjoyment.
%
%In the questionnaire, a few questions are asked regarding what kind of a gamer the player is. For instance, the questionnaire asks if the player normally gives up early or late, if he likes puzzle games, which controller type he prefers, and how many hours a week he plays games on average. These answers can be used to measure if for instance the game is only enjoyable for more avid puzzle gamers. In the end of the questionnaire, the player is asked about age and gender to see if this might have an influence.
%
%The logging data from the game is mostly used for measuring the player's performance. This is data such as amount of time spent on each puzzle, as well as amount of deaths. This could be used for i.e.\ seeing which group was actually best at the game.